Intelligent integration approach of big data for urban infrastructure management and maintenance
LIU Jiajun, YU Gang, HU Min
Journal of Computer Applications    2017, 37 (10): 2983-2990.   DOI: 10.11772/j.issn.1001-9081.2017.10.2983
Abstract
Operation and maintenance big data for urban infrastructure is high-dimensional, diverse and variable. To improve data integration efficiency, enhance the platform's statistical and decision analysis performance, and reduce both Extract-Transform-Load (ETL) execution time and the load on the data center, a Multilevel Task Scheduling (MTS) ETL framework (MTS-ETL) was proposed to meet intelligent maintenance requirements. Firstly, the data warehouse was divided into a series of areas, including a temporary data area, a data storage area, a data classification area and a data analysis area. Following this partition, the overall ETL process was divided into four levels of ETL task scheduling. Multi-frequency ETL job scheduling, together with sequential and non-sequential ETL working modes, was also designed. Secondly, the conceptual, logical and physical modelling of data integration was carried out based on the non-sequential mode of the MTS-ETL framework. Finally, the ETL transformation module and job module were designed with Pentaho Data Integration to implement this data integration method. In the traffic flow data integration experiment, the method integrated 136754 data items in only 28.4 seconds; in a thousand-scale data integration experiment, it reduced the total average execution time by 6.51% compared with the traditional ETL method. The reliability of the ETL process was demonstrated by the report analysis results obtained from integrating 4 million data items. The proposed method can effectively integrate operation and maintenance big data, improve the statistical analysis performance of the platform and keep ETL execution time at a low level.
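The four-level scheduling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Pentaho Data Integration implementation. Everything below is an assumption made for illustration: the one-level-per-area mapping, the `EtlLevel` class, the toy transforms (`stage`, `clean`, `classify`, `aggregate`), the `every_n_ticks` frequencies, and the reading of "non-sequential mode" as each level running on its own schedule against whatever already sits in its upstream area.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Areas = Dict[str, List[Row]]


@dataclass
class EtlLevel:
    """One scheduling level that moves data from one warehouse area to the next."""
    name: str
    source: str
    target: str
    transform: Callable[[List[Row]], List[Row]]
    every_n_ticks: int = 1  # hypothetical per-level run frequency


def stage(rows: List[Row]) -> List[Row]:
    # Level 1 (assumed): copy raw rows into the temporary area unchanged.
    return [dict(r) for r in rows]


def clean(rows: List[Row]) -> List[Row]:
    # Level 2 (assumed): drop rows without a flow value and cast flow to int.
    return [{"road": r["road"], "flow": int(r["flow"])}
            for r in rows if r["flow"] is not None]


def classify(rows: List[Row]) -> List[Row]:
    # Level 3 (assumed): label each row with an illustrative flow band.
    return [{**r, "band": "high" if r["flow"] >= 100 else "low"} for r in rows]


def aggregate(rows: List[Row]) -> List[Row]:
    # Level 4 (assumed): per-band counts and totals for reporting.
    totals: Dict[str, Row] = {}
    for r in rows:
        t = totals.setdefault(
            r["band"], {"band": r["band"], "count": 0, "total_flow": 0})
        t["count"] += 1
        t["total_flow"] += r["flow"]
    return [totals[band] for band in sorted(totals)]


LEVELS = [
    EtlLevel("L1", "source", "temporary", stage, every_n_ticks=1),
    EtlLevel("L2", "temporary", "storage", clean, every_n_ticks=1),
    EtlLevel("L3", "storage", "classification", classify, every_n_ticks=2),
    EtlLevel("L4", "classification", "analysis", aggregate, every_n_ticks=4),
]


def new_areas(source_rows: List[Row]) -> Areas:
    # Build an empty warehouse with one list per area, plus the raw source.
    areas: Areas = {"source": list(source_rows)}
    for level in LEVELS:
        areas[level.target] = []
    return areas


def run_level(level: EtlLevel, areas: Areas) -> None:
    # Full refresh: recompute the target area from the current source area.
    areas[level.target] = level.transform(areas[level.source])


def run_sequential(areas: Areas) -> None:
    # Sequential mode (assumed): run every level once, in order.
    for level in LEVELS:
        run_level(level, areas)


def run_non_sequential(areas: Areas, tick: int) -> None:
    # Non-sequential mode (assumed): a level runs only when its frequency is due.
    for level in LEVELS:
        if tick % level.every_n_ticks == 0:
            run_level(level, areas)


SAMPLE = [
    {"road": "A", "flow": "120"},
    {"road": "B", "flow": None},
    {"road": "C", "flow": "40"},
    {"road": "A", "flow": "80"},
]
```

Calling `run_sequential(new_areas(SAMPLE))` fills all four areas in one pass. Calling `run_non_sequential` with ticks 1 to 4 refreshes the lower levels on every tick, while the analysis area is rebuilt only on tick 4; this is one plausible way multi-frequency scheduling could keep expensive downstream steps infrequent.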